Corpora compilation for prosody-informed speech processing

Abstract

Research on speech technologies necessitates spoken data, which is usually obtained through read recorded speech and specifically adapted to the research needs. When the aim is to deal with the prosody involved in speech, the available data must reflect natural and conversational speech, which is usually costly and difficult to get. This paper presents a machine learning-oriented toolkit for collecting, handling, and visualization of speech data using prosodic heuristics. We present two corpora resulting from these methodologies: the PANTED corpus, containing 250 h of English speech from TED Talks, and the Heroes corpus, containing 8 h of parallel English and Spanish movie speech. We demonstrate their use in two deep learning-based applications: punctuation restoration and machine translation. The presented corpora are freely available to the research community.

Similar resources

Prosograph: A Tool for Prosody Visualisation of Large Speech Corpora

This paper presents an open-source tool that has been developed to visualize a speech corpus with its transcript and prosodic features aligned at word level. In particular, the tool is aimed at providing a simple and clear way to visualize prosodic patterns along large segments of speech corpora, and can be applied in any research that involves prosody analysis.

Computing Prosody, Computational Models for Processing Spontaneous Speech

Give us 5 minutes and we will show you the best book to read today. This is it: Computing Prosody, Computational Models for Processing Spontaneous Speech, which will be your best choice for better reading. Your five minutes will not be wasted reading this website. You can use the book as a source for building better concepts. Referring to books that suit your needs is some...

Object-based Modelling for Representing and Processing Speech Corpora

This thesis deals with modelling the data in large speech corpora using an object-oriented paradigm that captures important linguistic structures. Information from corpora is transformed into objects, which are assigned properties regarding their behaviour. These objects, called speech units, are placed onto a multi-dimensional framework and have their relationships to other units explicitly...

Automatic analysis of prosody for multi-lingual speech corpora. Daniel Hirst

This chapter outlines a general approach and describes a set of tools for the automatic analysis of multilingual speech corpora. Two levels of representation can be derived automatically: a phonetic representation, which provides an extremely close copy of the original speech signal, and a surface phonological representation, which reduces the variability to a small number of discrete values wi...

Emotional Speech Processing at the Intersection of Prosody and Semantics

The ability to accurately perceive emotions is crucial for effective social interaction. Many questions remain regarding how different sources of emotional cues in speech (e.g., prosody, semantic information) are processed during emotional communication. Using a cross-modal emotional priming paradigm (Facial affect decision task), we compared the relative contributions of processing utterances ...

Journal

Journal title: Language Resources and Evaluation

Year: 2021

ISSN: 1574-020X, 1574-0218

DOI: https://doi.org/10.1007/s10579-021-09556-2